24 - (Lecture 8, Part 3) Dense Motion Estimation [ID:32180]

Hello everyone, and welcome back to the computer vision lecture series. This is lecture 8, part 3, and in it we continue talking about dense motion estimation. In the last lecture we saw how parametric motion can be solved using nonlinear least squares methods. For every kind of transformation, whether the movement is a translation, a Euclidean or similarity transform, affine, or even projective, we can use the corresponding Jacobian, in the matrix form shown on the slide, solve for the unknowns (the motion parameters), and then use them to compute the motion between two frames or two images, whichever the case may be.

Today we continue with how we can estimate optical flow and how many different methods can be used for calculating it. Once the optical flow is estimated, we obtain a generic motion estimate between the frames of a video, or between two images. The most general form of optical flow is represented by the error metric on the slide, in which u_i is the motion vector of pixel i in the given image I0; minimizing this error yields the motion, the optical flow, for each and every pixel. So this is a truly dense result: every pixel gets its own motion vector.
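Written out, this per-pixel error metric is typically a sum of squared differences over all pixels, E(u) = sum over pixels i of [I1(x_i + u_i) - I0(x_i)]^2. Here is a minimal pure-Python sketch of evaluating it for a candidate flow field (function and variable names are my own, not from the lecture):

```python
def ssd_flow_error(I0, I1, flow):
    """Sum-of-squared-differences error of a dense flow field.

    I0, I1 : 2D lists of gray values (reference and target frame).
    flow   : 2D list of (dy, dx) integer displacements, one per pixel.
    Pixels whose displaced position falls outside I1 are skipped.
    """
    H, W = len(I0), len(I0[0])
    err = 0.0
    for y in range(H):
        for x in range(W):
            dy, dx = flow[y][x]
            ty, tx = y + dy, x + dx
            if 0 <= ty < H and 0 <= tx < W:
                d = I1[ty][tx] - I0[y][x]
                err += d * d
    return err
```

With the correct displacement field the error vanishes; a wrong field leaves a positive residual.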

However, computing this directly is inefficient. It is a highly complex problem: there is a very large number of unknowns and not enough information available to constrain them, and we do not always need this per-pixel level of information anyway. The obvious idea we already know: for each pixel in I0, find some pixel in I1 with the same color. For example, if you are using a color space with a value range of 0 to 255, each pixel channel can take one of only 256 values. So say a pixel has the value 120; you just look for that value in the next frame, match the two pixels, and estimate the flow between them. But if you have, say, one million pixels in your input image and only 256 values to distinguish them, there can be many, many solutions. It is not guaranteed that every pixel has a unique match; with only 256 values, it is highly likely that every pixel value has thousands of matches. So this obvious solution will not work.

We therefore move a step ahead and think in terms of patches. How do we do patch-based optical flow? Instead of searching the whole image for a particular pixel value, we look at patches: we define a patch around the pixel in the reference image, look for this patch in the next image, and compute the optical flow between the matched positions. So in the patch-based approach, instead of a single pixel, a local neighborhood of the pixel determines the match that gives the optical flow.
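As a concrete illustration, here is a minimal pure-Python sketch of matching one patch by sum of squared differences over a small search window (all names are my own; a real implementation would use vectorized code):

```python
def match_patch(I0, I1, y, x, half=2, search=3):
    """Flow (dy, dx) for pixel (y, x) by SSD patch matching.

    (y, x) must lie at least `half` pixels away from the border of I0.
    Only displacements keeping the patch fully inside I1 are tried.
    """
    H, W = len(I0), len(I0[0])

    def ssd(dy, dx):
        # Sum of squared differences between the reference patch
        # and the displaced patch in the target image.
        s = 0.0
        for py in range(-half, half + 1):
            for px in range(-half, half + 1):
                d = I1[y + dy + py][x + dx + px] - I0[y + py][x + px]
                s += d * d
        return s

    best, best_d = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if half <= y + dy < H - half and half <= x + dx < W - half:
                s = ssd(dy, dx)
                if best is None or s < best:
                    best, best_d = s, (dy, dx)
    return best_d
```

For instance, if I1 is I0 shifted down by one row and right by two columns, the best match for an interior pixel comes out as (1, 2).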

What this means is that you fix a neighborhood in your reference image, look for that same neighborhood in your target image, and when you find the match, compute the optical flow for the given pixel. You can use this patch-based approach with the same algorithms we have discussed before. You can also combine it with image pyramids as a multi-scale tool. If you reduce the image to the coarsest scale first, it is easier to form and match patches there; you compute the optical flow at that level using the patch-based approach, then move up scale by scale, refining the optical flow vectors in an iterative fashion as we discussed in the previous part, and so on, until at the original image scale you obtain the final optical flow value for each and every pixel.

However, this approach is still not robust in every case. If the motion is too large, or the time difference between the two frames for which you are calculating the optical flow is too large, new objects may appear. A very good scenario is traffic monitoring: if you take two images separated by a few seconds, a lot of new cars and other vehicles have come into the image frame, and a lot of objects that were in the previous frame have left it. So because of these occlusions and appearances, simple patch matching runs into trouble.
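The coarse-to-fine pyramid scheme described above can be sketched in code. For clarity this pure-Python toy (all names are my own) estimates one global translation instead of a per-pixel flow field: build a two-frame pyramid, search at the coarsest level, then double and refine the estimate at each finer level:

```python
def downsample(I):
    """Halve resolution by averaging 2x2 blocks."""
    H, W = len(I) // 2 * 2, len(I[0]) // 2 * 2
    return [[(I[y][x] + I[y][x + 1] + I[y + 1][x] + I[y + 1][x + 1]) / 4.0
             for x in range(0, W, 2)] for y in range(0, H, 2)]

def best_shift(I0, I1, cy, cx, radius):
    """SSD search for a global (dy, dx) within `radius` of (cy, cx)."""
    H, W = len(I0), len(I0[0])
    best, best_d = None, (cy, cx)
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            s, n = 0.0, 0
            for y in range(H):
                if not 0 <= y + dy < H:
                    continue
                for x in range(W):
                    if 0 <= x + dx < W:
                        d = I1[y + dy][x + dx] - I0[y][x]
                        s += d * d
                        n += 1
            s /= max(n, 1)  # normalize by overlap area
            if best is None or s < best:
                best, best_d = s, (dy, dx)
    return best_d

def coarse_to_fine_shift(I0, I1, levels=2, radius=2):
    """Estimate a global shift coarse-to-fine over an image pyramid."""
    pyramid = [(I0, I1)]
    for _ in range(levels - 1):
        a, b = pyramid[-1]
        pyramid.append((downsample(a), downsample(b)))
    dy = dx = 0
    for a, b in reversed(pyramid):  # coarsest level first
        dy, dx = 2 * dy, 2 * dx    # upscale the current estimate
        dy, dx = best_shift(a, b, dy, dx, radius)
    return dy, dx
```

The point of the pyramid: with a per-level search radius of 2, a single-level search can never recover a 4-pixel shift, but a two-level pyramid can, which is exactly why pyramids pay off for large motions.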

Part of a video series:

Presenters

Accessible via

Open access

Duration

00:18:59 min

Recording date

2021-05-03

Uploaded on

2021-05-03 17:38:31

Language

en-US

Tags

Computer Vision